Abstract

On a high-frequency scale, time series are not homogeneous, so standard
correlation measures cannot be applied directly to the raw data. There are two
ways to deal with this problem. The time series can be homogenised through an 
interpolation method [1] (linear or previous tick) and then the Pearson correlation 
statistic computed. Recently, methods that can handle raw non-synchronous time 
series have been developed [2,4]. This paper compares two traditional methods that 
use interpolation with an alternative method applied directly to the actual time 
series. 
1 Introduction

In this paper we present and compare three different methods of computing
the cross-correlation matrix from high-frequency equity trade data. The
analysis uses the component stocks of the S&P 100, with data drawn from the
NYSE Trades and Quotes (TAQ) database. In the context of this paper,
high-frequency data is defined as the raw time series of trades. The time
interval between transactions ranges from zero seconds (several distinct
trades recorded at the same time) to forty minutes.

An extension of the standard Pearson correlation measure is proposed in 
[1] by incorporating a "covolatility weighting" for the time series. The weight 
has the role of emphasizing periods where trading has a noticeable effect on 
asset prices. 

Let X, Y be two asset price time series which have been homogenised and
synchronised to a time step $\Delta t$; the covolatility weights are given by
$\omega_i$ and the time length of the trading period is $T$. We define
$\Delta x$, $\Delta y$ as the corresponding log returns series on a time scale
$\Delta t$ and $\Delta X$, $\Delta Y$ as the log returns on a larger time
scale $m\Delta t$. The covolatility adjusted correlation measure is defined as:

$$\hat\rho(\Delta X, \Delta Y) = \frac{\sum_{i=1}^{T/(m\Delta t)} (\Delta X_i - \overline{\Delta X})(\Delta Y_i - \overline{\Delta Y})\,\omega_i}{\sqrt{\sum_{i=1}^{T/(m\Delta t)} (\Delta X_i - \overline{\Delta X})^2\,\omega_i \;\sum_{i=1}^{T/(m\Delta t)} (\Delta Y_i - \overline{\Delta Y})^2\,\omega_i}}, \qquad (1)$$

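where

$$\omega_i = \sum_{j=1}^{m} \left( \left|\Delta x_{i \cdot m - j} - \overline{\Delta x}_{i \cdot m}\right| \cdot \left|\Delta y_{i \cdot m - j} - \overline{\Delta y}_{i \cdot m}\right| \right)^{\alpha}. \qquad (2)$$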

Setting $\omega_i = 1$ reduces (1) to the standard Pearson coefficient. In this
paper, as in [1], $\alpha = 0.5$, but this can be varied so that more weight is
given to periods where the returns volatility is above average. In [1] $m = 6$;
in our analysis it varies from 3 to 480 (the number of time units of $\Delta t$
in the trading day). This was determined by the choice of $\Delta t = 60$
seconds, which was taken as a tradeoff value for the average trading interval
pattern. The intention is to avoid extensive imputation towards the end of the
trading day, when there are few transactions occurring.
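
The homogenisation and weighting steps above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: it assumes previous-tick interpolation onto a regular grid, block weights of the form given in (2), and a reading of the correlation as a weighted Pearson coefficient in which each weight multiplies the corresponding cross-product and squared deviation (so that unit weights recover the ordinary Pearson coefficient). All function names are hypothetical.

```python
import numpy as np

def previous_tick(times, prices, grid):
    """Price of the last trade at or before each grid time."""
    times = np.asarray(times, dtype=float)
    prices = np.asarray(prices, dtype=float)
    idx = np.searchsorted(times, grid, side="right") - 1
    # Assumption: grid points before the first trade take the first price.
    return prices[np.clip(idx, 0, len(prices) - 1)]

def block_view(returns, m):
    """Reshape fine-scale returns into rows of m consecutive values."""
    n_blocks = len(returns) // m
    return np.asarray(returns, dtype=float)[: n_blocks * m].reshape(n_blocks, m)

def covolatility_weights(dx, dy, m, alpha=0.5):
    """One weight per block: sum of (|dx - mean| * |dy - mean|) ** alpha."""
    bx, by = block_view(dx, m), block_view(dy, m)
    dev_x = np.abs(bx - bx.mean(axis=1, keepdims=True))
    dev_y = np.abs(by - by.mean(axis=1, keepdims=True))
    return ((dev_x * dev_y) ** alpha).sum(axis=1)

def weighted_correlation(dX, dY, w):
    """Pearson correlation with observation weights w (w = 1: the usual one)."""
    mean_x = np.sum(w * dX) / np.sum(w)
    mean_y = np.sum(w * dY) / np.sum(w)
    cx, cy = dX - mean_x, dY - mean_y
    return np.sum(w * cx * cy) / np.sqrt(np.sum(w * cx**2) * np.sum(w * cy**2))

def covolatility_adjusted_correlation(x, y, m, alpha=0.5):
    """x, y: homogenised prices on the same grid; m: aggregation factor."""
    dx, dy = np.diff(np.log(x)), np.diff(np.log(y))
    dX = block_view(dx, m).sum(axis=1)  # log returns on the scale m * dt
    dY = block_view(dy, m).sum(axis=1)
    return weighted_correlation(dX, dY, covolatility_weights(dx, dy, m, alpha))
```

With a 60-second step, `grid` would be something like `np.arange(t_open, t_close + 60, 60)`, where `t_open` and `t_close` are hypothetical session bounds in seconds. The weights are undefined if every block has zero deviation, a case this sketch does not guard against.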

Methods that can be applied directly to the actual time series to obtain
correlation statistics have been presented in [2,3,4]. The method by de Jong [4]
is based on a regression-type estimator, but it relies on a rather strong
assumption of independence between prices and transaction times. Barucci and
Renò [2,3] have adapted a Fourier method developed by Malliavin [5] to the
computation of FX rate correlations. The Fourier method is model independent;
it produces very accurate, smooth estimates and handles the time series in
their original form, without imputation or discarding of data. A rigorous proof
of the method is given in the original paper by Malliavin [5], so only the main
results are given below.

Let $S_i(t)$ be the price of asset $i$ at time $t$ and $p_i(t) = \ln S_i(t)$.
The physical time interval of the asset price series is rescaled to
$[0, 2\pi]$. The variance/covariance matrix $\Sigma_{ij}$ of log returns is
derived from its Fourier coefficient $a_0(\Sigma_{ij})$, which is obtained from
the Fourier coefficients of $dp_i$:

$$a_k(dp_i) = \frac{1}{\pi}\int_0^{2\pi} \cos(kt)\,dp_i(t), \qquad b_k(dp_i) = \frac{1}{\pi}\int_0^{2\pi} \sin(kt)\,dp_i(t), \qquad k \ge 1. \qquad (4)$$
In practice, the coefficients are computed through integration by parts:

$$a_k(dp_i) = \frac{p_i(2\pi) - p_i(0)}{\pi} + \frac{k}{\pi}\int_0^{2\pi} \sin(kt)\,p_i(t)\,dt, \qquad b_k(dp_i) = -\frac{k}{\pi}\int_0^{2\pi} \cos(kt)\,p_i(t)\,dt.$$
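
As an illustration of the integration-by-parts formulas, the sketch below evaluates them for a log-price path that is assumed to stay constant between trades. That assumption is made here so that the integrals have a closed form; the text does not specify a quadrature rule. The function names are hypothetical, and trade times are assumed to be already rescaled so that the first trade falls at 0 and the last at 2π. Under these assumptions the result coincides with a direct sum of cos(k t_n) and sin(k t_n) against the log-price increments, which the second function computes for comparison.

```python
import numpy as np

def coefficients_by_parts(t, p, k):
    """a_k(dp), b_k(dp) from the integrated-by-parts formulas.

    t : trade times rescaled to [0, 2*pi], with t[0] = 0 and t[-1] = 2*pi
    p : log prices at those times, held constant until the next trade
    k : harmonic index (integer >= 1)
    """
    t = np.asarray(t, dtype=float)
    p = np.asarray(p, dtype=float)
    # Exact integrals of sin(kt) p(t) and cos(kt) p(t) over each interval
    # [t_n, t_{n+1}), on which p(t) = p_n.
    int_sin = np.sum(p[:-1] * (np.cos(k * t[:-1]) - np.cos(k * t[1:]))) / k
    int_cos = np.sum(p[:-1] * (np.sin(k * t[1:]) - np.sin(k * t[:-1]))) / k
    a_k = (p[-1] - p[0]) / np.pi + (k / np.pi) * int_sin
    b_k = -(k / np.pi) * int_cos
    return a_k, b_k

def coefficients_direct(t, p, k):
    """Same coefficients as a sum over the price jumps at the trade times."""
    t = np.asarray(t, dtype=float)
    dp = np.diff(np.asarray(p, dtype=float))
    return (np.sum(np.cos(k * t[1:]) * dp) / np.pi,
            np.sum(np.sin(k * t[1:]) * dp) / np.pi)
```

The two functions agree for integer k because the boundary terms cos(2πk) = 1 and sin(2πk) = 0 cancel exactly; the direct form avoids subtracting nearly equal quantities when k is large.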



The Fourier coefficient of the pointwise variance/covariance matrix is:

$$a_0(\Sigma_{ij}) = \lim_{\tau \to 0} \frac{\pi\tau}{T} \sum_{k=1}^{T/2\tau} \left[a_k(dp_i)\,a_k(dp_j) + b_k(dp_i)\,b_k(dp_j)\right]. \qquad (5)$$
The smallest wavelength that can be analysed before encountering aliasing
effects, and hence the highest harmonic $T/2\tau$, is determined by the lower
bound of $\tau$ (the time gap between two consecutive trades), which is 1 second
for all S&P 100 price series. The integrated value of $\Sigma_{ij}$ over the
time window is defined as $\hat\sigma^2_{ij} = 2\pi a_0(\Sigma_{ij})$, which
leads to the Fourier correlation matrix
$\rho_{ij} = \hat\sigma^2_{ij} / (\hat\sigma_{ii}\,\hat\sigma_{jj})$.
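
The estimator described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the coefficients are evaluated as sums of cos(k t) and sin(k t) against the observed log-price increments of each asset at its own trade times, the limit is replaced by a finite number of harmonics `n_harmonics` (playing the role of T/2τ), and the function names and argument layout are hypothetical.

```python
import numpy as np

def fourier_coefficients(times, log_prices, t_start, t_end, n_harmonics):
    """a_k(dp), b_k(dp) for k = 1..n_harmonics from one asset's trades."""
    # Rescale physical time on [t_start, t_end] to [0, 2*pi].
    t = 2.0 * np.pi * (np.asarray(times, dtype=float) - t_start) / (t_end - t_start)
    dp = np.diff(np.asarray(log_prices, dtype=float))
    k = np.arange(1, n_harmonics + 1)[:, None]
    phase = k * t[1:][None, :]          # shape (n_harmonics, n_returns)
    a = (np.cos(phase) @ dp) / np.pi
    b = (np.sin(phase) @ dp) / np.pi
    return a, b

def fourier_correlation(series, t_start, t_end, n_harmonics):
    """Correlation matrix from a list of (times, log_prices) pairs.

    Each asset keeps its own, possibly non-synchronous, trade times.
    """
    coeffs = [fourier_coefficients(t, p, t_start, t_end, n_harmonics)
              for t, p in series]
    n = len(coeffs)
    cov = np.empty((n, n))
    for i, (a_i, b_i) in enumerate(coeffs):
        for j, (a_j, b_j) in enumerate(coeffs):
            # Finite-harmonic version of the limit: pi*tau/T = pi/(2*n_harmonics).
            a0 = np.pi / (2.0 * n_harmonics) * np.sum(a_i * a_j + b_i * b_j)
            cov[i, j] = 2.0 * np.pi * a0    # integrated (co)variance
    vol = np.sqrt(np.diag(cov))
    return cov / np.outer(vol, vol)
```

Reducing `n_harmonics` corresponds to a larger time scale τ, so a correlation spectrum can be traced by calling `fourier_correlation` over a range of harmonic counts. The matrix of phases grows as `n_harmonics` times the number of trades, so long series may need to be processed in chunks of harmonics.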



2 Results 



We tested the Fourier method on simulated bivariate GARCH(1,1) processes in
a setting similar to that in [2]. The time interval $\delta$ between trades
in S&P 100 equities approximately follows an exponential distribution with
rate parameter $\beta$ in the range 1 (very liquid stock) to 22 (least liquid
stock) seconds. We sampled the generated GARCH process using the exponential
distribution and varied $\beta$ so as to resemble actual trading patterns.
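
A simulation of this kind might be set up as in the sketch below. Several choices here are assumptions not given in the text: the two GARCH(1,1) series are coupled through a constant correlation of their innovations, the GARCH parameters and the 23,400-second horizon are purely illustrative, and β is interpreted as the mean gap between trades in seconds (the `scale` argument of NumPy's exponential sampler). All function names are hypothetical.

```python
import numpy as np

def simulate_ccc_garch(n, rho, omega=1e-7, a=0.05, b=0.90, seed=0):
    """Two GARCH(1,1) return series whose innovations have correlation rho."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n)
    z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    z = np.vstack([z1, z2])
    h = np.full(2, omega / (1.0 - a - b))   # start at the unconditional variance
    r = np.empty((2, n))
    for t in range(n):
        r[:, t] = np.sqrt(h) * z[:, t]
        h = omega + a * r[:, t] ** 2 + b * h
    return r

def exponential_trade_times(beta, horizon, rng):
    """Trade times in [0, horizon) with exponential gaps of mean beta seconds."""
    gaps = rng.exponential(beta, size=int(1.5 * horizon / beta) + 50)
    while gaps.sum() < horizon:             # draw more gaps if we fell short
        gaps = np.concatenate([gaps, rng.exponential(beta, size=gaps.size)])
    times = np.cumsum(gaps)
    return times[times < horizon]

def sample_at_trades(log_price, trade_times):
    """Observe a one-second log-price path at the given trade times."""
    return log_price[np.floor(trade_times).astype(int)]

rng = np.random.default_rng(42)
returns = simulate_ccc_garch(n=23_400, rho=0.7)      # one-second returns
log_prices = np.cumsum(returns, axis=1)
t1 = exponential_trade_times(beta=2.0, horizon=23_400, rng=rng)   # liquid
t2 = exponential_trade_times(beta=15.0, horizon=23_400, rng=rng)  # illiquid
p1 = sample_at_trades(log_prices[0], t1)
p2 = sample_at_trades(log_prices[1], t2)
```

Drawing `t1` and `t2` independently gives asynchronous series; reusing one set of trade times for both assets gives the synchronous case with random gaps.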

The method works very well on synchronous series with random gaps,
irrespective of the rate $\beta$. The tests on asynchronous series with
$\beta < 6$ were also successful for the entire correlation spectrum. When
$\beta > 10$ for at least one of the two series, the correlation decays
noticeably at time scales smaller than 5 minutes but converges quickly to the
induced value when the time scale is greater than 10 minutes. The correlation
decay in the high-frequency regime seems to be directly related to the rate
parameter $\beta$.
Fig. 1. Time scale values correspond to m for the Pearson and Covolatility Adjusted methods and to t for the Fourier
method. Panel (a) shows Intel and Cisco ($\beta_{\text{Cisco}} = 1.2$); panel (b) shows Heinz and Campbell.

Fig. 1 shows the correlation spectra for two pairs of stocks, computed with
each of the three correlation measures. In both cases the Fourier correlation
method provides a much smoother spectrum than the other two methods (Pearson
and Covolatility Adjusted), which use interpolation. The "Epps effect", the
fall in correlation between stocks as the time scale decreases [6], can also
be observed in the two plots. It displays one of the properties described in
[7]: the more an asset is traded, the less marked the Epps effect is.

The correlation between Intel and Cisco (highly liquid, low $\beta$) reaches
a stable level after approximately 15 minutes, whilst for Heinz and Campbell
(lower liquidity, higher $\beta$) it takes about 2 hours to stabilise.
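
For the interpolation-based Pearson measure, a correlation spectrum of this kind can be traced by aggregating homogenised log returns over increasing time scales m and recomputing the correlation at each scale. The sketch below is a hypothetical helper written under that assumption; it takes fine-scale log returns that are already on a common grid and is not the authors' code.

```python
import numpy as np

def pearson_spectrum(dx, dy, scales):
    """Pearson correlation of returns aggregated over m steps, for each m."""
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    spectrum = []
    for m in scales:
        n_blocks = len(dx) // m
        if n_blocks < 3:
            spectrum.append(np.nan)   # too few aggregated returns to estimate
            continue
        # Sum m consecutive fine-scale log returns to get returns on scale m.
        dX = dx[: n_blocks * m].reshape(n_blocks, m).sum(axis=1)
        dY = dy[: n_blocks * m].reshape(n_blocks, m).sum(axis=1)
        spectrum.append(np.corrcoef(dX, dY)[0, 1])
    return np.array(spectrum)
```

Plotting the returned values against `scales` gives one curve per stock pair; the time scale at which the curve flattens is the stabilisation time referred to above.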

The Fourier method averages the Covolatility adjusted correlation measure 
at very high frequency (time scale under 20 minutes) and trails a moving av- 
erage of the Pearson coefficient at lower frequencies. This indicates that the 
method is also robust in the low frequency domain where the Pearson method 
can be taken as the benchmark. The correlation decay observed in the GARCH 
tests at time scales of less than 5 minutes due to non-synchronicity in trades 
cannot account for the correlation structure that develops over 2 hours in
Fig. 1(b). Thus, the Epps effect present in the correlation structure of illiquid
stocks cannot be explained by non-synchronicity in transactions but is an
actual market microstructure phenomenon related to the information aggregation
and price formation processes. From the analysis carried out it can be
inferred that the Fourier method of computing the correlation matrix from 
high-frequency data is better than the alternatives in terms of generating 
smooth, robust estimates. It is conceptually superior to methods that use in- 
terpolation and is also model independent. Further studies are under way [8] 
to explore other contributing factors to the Epps effect, the impact of trad- 
ing synchronicity on the correlation measure and the time-scale dynamics of 
correlation matrices. 